366 research outputs found

    Pipeline Implementation of Peer Group Filtering in FPGA

    Get PDF
    In the paper a parallel FPGA implementation of the Peer Group Filtering algorithm is described. Implementation details, results, performance of the design and FPGA logic resources are discussed. The PGF algorithm customized for FPGA is compared with the original one and Vector Median Filtering

    Real-time Foreground Object Detection Combining the PBAS Background Modelling Algorithm and Feedback from Scene Analysis Module

    Get PDF
    The article presents a hardware implementation of the foreground object detection algorithm PBAS (Pixel-Based Adaptive Segmenter) with a scene analysis module. A mechanism for static object detection is proposed, which is based on consecutive frame differencing. The method allows to distinguish stopped foreground objects (e.g. a car at the intersection, abandoned luggage) from false detections (so-called ghosts) using edge similarity. The improved algorithm was compared with the original version on popular test sequences from the changedetection.net dataset. The obtained results indicate that the proposed approach allows to improve the performance of the method for sequences with the stopped objects. The algorithm has been implemented and successfully verified on a hardware platform with Virtex 7 FPGA device. The PBAS segmentation, consecutive frame differencing, Sobel edge detection and advanced one-pass connected component analysis modules were designed. The system is capable of processing 50 frames with a resolution of 720 × 576 pixels per second.

    Optimisation of the PointPillars network for 3D object detection in point clouds

    Full text link
    In this paper we present our research on the optimisation of a deep neural network for 3D object detection in a point cloud. Techniques like quantisation and pruning available in the Brevitas and PyTorch tools were used. We performed the experiments for the PointPillars network, which offers a reasonable compromise between detection accuracy and calculation complexity. The aim of this work was to propose a variant of the network which we will ultimately implement in an FPGA device. This will allow for real-time LiDAR data processing with low energy consumption. The obtained results indicate that even a significant quantisation from 32-bit floating point to 2-bit integer in the main part of the algorithm, results in 5%-9% decrease of the detection accuracy, while allowing for almost a 16-fold reduction in size of the model.Comment: 7 pages, 2 figures, submitted to SPA 2020 conferenc

    The VINEYARD Approach: Versatile, Integrated, Accelerator-Based, Heterogeneous Data Centres.

    Get PDF
    Emerging web applications like cloud computing, Big Data and social networks have created the need for powerful centres hosting hundreds of thousands of servers. Currently, the data centres are based on general purpose processors that provide high flexibility buts lack the energy efficiency of customized accelerators. VINEYARD aims to develop an integrated platform for energy-efficient data centres based on new servers with novel, coarse-grain and fine-grain, programmable hardware accelerators. It will, also, build a high-level programming framework for allowing end-users to seamlessly utilize these accelerators in heterogeneous computing systems by employing typical data-centre programming frameworks (e.g. MapReduce, Storm, Spark, etc.). This programming framework will, further, allow the hardware accelerators to be swapped in and out of the heterogeneous infrastructure so as to offer high flexibility and energy efficiency. VINEYARD will foster the expansion of the soft-IP core industry, currently limited in the embedded systems, to the data-centre market. VINEYARD plans to demonstrate the advantages of its approach in three real use-cases (a) a bio-informatics application for high-accuracy brain modeling, (b) two critical financial applications, and (c) a big-data analysis application

    Efficient Hardware Implementation of the Horn-Schunck Algorithm for High-Resolution Real-Time Dense Optical Flow Sensor

    No full text
    This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed, that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification allows to achieve data throughput of 175 MPixels/s and makes processing of Full HD video stream (1; 920 × 1; 080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in details. Three versions of optical flow modules, with different numerical precision, working frequency and obtained results accuracy are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from an optical flow dataset of the Middlebury University. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems
    corecore